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The last decade has seen max-stable processes emerge as a common tool 
for the statistical modelling of spatial extremes. However, their application 
is complicated due to the unavailability of the multivariate density function, 
and so likelihood-based methods remain far from providing a complete and 
flexible framework for inference. In this article we develop inferentially prac- 
tical, likelihood-based methods for fitting max-stable processes derived from a 
composite-likelihood approach. The procedure is sufficiently reliable and ver- 
^ | satile to permit the simultaneous modelling of joint and marginal parameters 

in the spatial context at a moderate computational cost. The utility of this 
£S) . methodology is examined via simulation, and illustrated by the analysis of U.S. 

precipitation extremes. 
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§ ; 1 Introduction 

A common objective of spatial analysis is to quantify and characterise the behavior 
rS [ of environmental phenomena such as precipitation levels, windspeed or daily temper- 
ature s. A number of generic appro ac hes to spatial modellin g have been developed 
(e.g. iBarndorff-Nielsen et all (jl998l ): ICressiel fjl993l ): iRiplevI too4 ) ). but these are 



not necessarily ideal for handling extremal aspects given their focus on mean process 
levels. Analyses of spatial extremes are useful devices for understanding and predict- 
ing extreme events such as hurricanes, storms and floods. In light of recent concerns 
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over climate change, the use of robust mathematical and statistical methods for such 
analyses has grown in importance 

While the theory and statistical practice of univariate extremes is well developed, 
there is much less guidance for the modelling of spatial extremes. This is problem- 
atic as many environmental processes have a natural spatial domain. We consider 
a temporal series of componentwise maxima of process measurements recorded at 
k — 1, . . . , K locations, within a contiguous region. Observations {y n ,k} each denote 
the maximum of m samples over n = 1, . . . , N temporal blocks. For example, for 
daily observations, m = 366 implies the {y n ,k} describe process annual maxima. 

The spatial a nalogue of multivariate extreme value mode l s is the class o f max- 
stable processes ( Ide Haanl . 1 1984 Ide Haan and Pickandd . Il986l ; iResnickl . 119871 ). Max- 
stable processes have a similar asymptot i c motivation to the univariate Generalised 
Extreme Value (GEV) f von Mises . 1954 ; Jenkinson . 19551 )). providing a general ap- 
proach to modelling process extremes incorporating temporal or spatial dependence. 
Statistical methods fo r max- s table p rocess es and data analysis of pra c tical prob- 



lems are discuss e d by ISmithl ( Il990l ). IColesI (119931 ). IColes and Walshawl (119941 ) and 



Coles and Tawnl (119961 ). Standard likelihood methods for such models are compli- 



cated by the intractability of the multivariate density function in all but the most 
trivial cases. This presents an obstacle in the use of max-stable processes for spatial 
extremes. 

There is a lack of a proper inf e rentia l framework for the analysis of spatial extremes 
(although iDe Haan and Pereiral (120061 ) describe some non-parametric estimators). In 
this article we develop flexible and inferentially practical methods for the fitting of 
max-stable pro cesses to spati al data based on non-standard, composite likelihood- 
based methods iLindsayi (119881 ). An appealing feature of this approach is that the es- 
timation of GEV marginal parameters can be performed jointly with the dependence 
parameters in a unified framework. Accordingly, there is no need for separate estima- 
tion procedures. With highly-structured problems such as max-stable processes, this 
approach produces flexible and reliable results with a moderate computational cost. 

The article is organised as follows: Section [2] reviews the theory of max-stable 
processes and its relationship to spatial extremes. Our composite likelihood approach 
is developed in Section [3] and Section [4] evaluates the method's performance through 
a number of simulation studes. We conclude with an illustration of a real extremal 
data analysis of U.S. precipitation levels. 



2 Max-stable processes and spatial extremes 
2.1 Max-stable processes 

Max-stable processes provide a natural generalisation of extremal dependence struc- 
tures in continuous spaces. From this, closed-form bivariate distributions can be 
derived. 
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Definition. Let T be an index set and {Yi(t)}t£T, i = 1, . . . ,n be n independent 
replications of a continuous stochastic process. Assume that there are sequences of 
continuous functions a n (t) > and b n (t) G M. such that 



Y{t) 



lim 

n— *oo 



max^i^(t) -b n (t) 
a n (t) 



t G T. 



If this limit exists, the limit process Y(t) is a max-stable process Ide Haanl (119841 ) 



Two properties follow from the above definition |Pe Haan and Resnickl (119771 ). Firstly, 
the one-dimensional marginal distributions belong to the class of generalised extreme 
value distributions (GEV), Y ~ GEV(/i, A,£) with distribution function 



F{y;n,\,0 = exp 



ay - n) \ ~ m 



A 



-oo < /i, £ < oo, A > 0, 



wher e a+ = max(0; a) and p, A a nd £ are respectively location, scale and shape param- 
eters Fisher and Tippettl (119281 ). Secondly, for any K — 2, 3, . . ., the i^-dimensional 



marginal distribution belongs to the class of multivariate extreme value distributions. 

W.l.o.g. if a n (t) = n, b n (t) = Vt, then the corresponding process, {Z(t)} t £T, 
has unit Frechet margins, with distribution function F(z) = exp(— \jz\z > 0. This 
process is obtainable as standardisation of {Y(t)} t( z T through 



{Z(t)} 



where /i(t), ^(t) and A(t) > are now continuous functions. The process Z is still a 
max-stable process. If Z is also stationary, the proce ss may be expressed through it's 
spectral representation lde Haan and Pickandsl (119861 ). 

In detail, let {X,-, Uj}j>± be a Poisson process, II, on M n x E + , with counting 
measure II(-) := E^x,, (/,-)(•) and intensity measure v(dx) x u~ 2 du, where I(x ji c/ i )(^4) 
is the indicator function of the random number of points falling in a bounded set 



A C 



and v is a positive measure. For a nonnegative measurable (for fixed 



teT) function f(x — t) such that f Rn f(x — t)u(dx) = 1, Vt G T the stochastic process 

Z{t):= maxiUrfiXj-t)}, teT, (1) 

J=l,2,... 



is a stationary max-stable process Ide Haanl (ll984T ). ISmithl (Il990f ) reinterprets this 
process in terms of environmental episodes such as storm phenomena, in which 
U, X and / repres e nt re spectively storm magnitude, the center and the shape. 
Schlather and Tawnl (120031 ) term this the storm profile model. 
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For a finite set of indexes ti,...,tx £ T and positive thres holds Zi, . . . , zk for 
K G N, the distribution of the random vector Z(ti), . . . , Z(t K ) is Ide Haanl (119841 ) 



Pr{Z(t k ) < z k , k = 1, . . . ,K} = exp 



J f{x ~ tk) ■ 

max < > i>{dx) 

\<k<K | z k 



(2) 



It then follows that the marginal distributions are unit Frechet: 

Vi{Z{t) <z} = exp ( -z' 1 I f(x - t)u(dx)) = exp(-l/z) 



Alternative spectral representations of max-stable processes exist ( jSchlatherl . 120021 ). 



2.2 Extremal coefficients 

Given n independent realisations of a random vector Y € R d , the joint distribu tion 
of componentwise maxima satisfies (IDe Haan and Resnickl . Il977t iResnickl . 119871 ) 



I. 



Pr -l max max Yjf'/n < z 

k j=l,...,n 



Pr <^ max In < z 

1 j=l,...,n 



exp(—9/z), z > 0, 



for k — 1,2, ... ,K and common threshold z, where the rightmost term is a Frechet (9) 
distribution. The parameter 1 < 9 < K is the extremal coefficient and it measures 
the extrema l dependence between the margins, an important practical quantity in 
applications ISmithl (ll990l ). The information in the extremal coefficient reflects the 
practical number of independent variables. If K is finite then 9 = 1 indicates complete 
dependence, whereas 9 = K demonstrates full independence. 

In the max-stable process framework, from ([2]), for all z > we have 



Pr{Z(t k ) <z,k = l,...,K} = exp(-0/z), 



and so 



9 



max { f (x — tk)} vidx) 

n l<k<K 



where 9 again represents the effective number of independent variables. ISchlather and Tawn 
f )2003l ) discuss the extremal coefficient within a max-stable context. 



2.3 Spatial models 

Suppose now that T C M 2 and that {Xj}j>i are random points in M 2 . While for 
K > 2, the general K- dimensional distribution function under the max-stable process 
representation ([T|) permits no analytically tractable form, a class of bivariate spatial 
models is available when the storm profile model, /, is a bivariate Gaussian density 
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and fx is a Lebesgue measure flSmithl . fl990l : |Pe Haan and Pereiral . |2006| ) . In this case, 
for locations tj and tj the bivariate distribution function of {Z(0), Z(h)} is 



Pr{Z(0) < Zi,Z(h) < Zj } 



exp 



a(h) 



+ 



a{h) Zi 



a(h) 



a(h) 



log- 



(3) 



where h = (tj — tj) T , is the origin, $ is the standard Gaussian distribution function, 
a(h) = (h T Y,~ 1 h) 1 / 2 and £ is the covariance matrix of /, with covariance 0*12 and 
standard deviations <J\,a2 > 0. A derivation of ([3]) is in Appendix A. 2. A general 
max-stable process w ith a Gaussia n storm profile model, /, is termed a Gaussian 
extreme value vroce ss ISmithl (119901) . whereas the specific model ([3]) is the Gaussian 
extreme value model IColesI (119931 ). 

Second-order partial derivatives of ([3]) yield the 2-dimensional density function 



f{z u zj) = exp 



$(w(h)) $Kh))lf/$(„(h)) y?Mh)) y>(v(h)) 



a(h)z 2 a(h)ziZj 
v(h) ip(w(h)) w(h) (p(v(h))^ 




^Kh)) , ¥>(«(h)) <^(h)) 



a(h)z ? 2 



a(h)^j^- 



a(h) 2 z 2 z d 



a(h) 2 ZiZ 2 ) 



(4) 



where </j is the standard Gaussian density function, w(h) = a(h)/2 + log(zj / Zi) / a(h) 
and f (h) = a(h) — io(h). The derivation of (jlj) is in Appendix A. 3. 

Observe that a(h) measures the strength of extremal dependence: a(h) — > repre- 
sents complete dependence, and (in the limit) a(h) — > oo indicates complete indepen- 
dence. In accordance with spatial models, the extreme dependen ce between Z(0) and 



Z(h) decreases monotonically and continuously with h = ||tj — tj|| iDe Haan and Pereira 
( 120061 ). and for fixed h the dependence decreases monotonically as a(h) increases. Ac- 
cordingly, characterisation of extremal dependence is determined by the covariance, 
E, which is therefore of interest for inference. 

Due to high-dimensional distributional complexity the study of extremal depen- 
dence is commonly limited to pairwise components through the extremal coefficients 



9(h) 



max{/(x), /(x - h)}dx, 1 < 0(h) < 2. 



The dependence on h is explicit. Specifically for the Gaussian extreme value model, 
9(h) = 2$(a(h)/2), following an argument along the lines of Appendix A. 2. Al- 
ternative models result by considering e.g. exponential or t storm profi le models 
De Haan and Pereira] (120061 ). or stationary Gaussian process profile models ISchlather 
( l2002h . 
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3 Likelihood-based inference 



The analysis of spatial extremes is concerned with the joint modelling of a spatial 
process at large numbers of data-recording stations in a fixed region. As discussed 
in Section [2j the lack of closed-form distribution for max-stable processes in greater 
than K = 2 dimensions precludes straightforward use of standard maximum like- 
lihood methods for this class of models. We now develop inferentially practical, 
likelihood-base d classes of max-stab l e pro cesses derived from a composite-likelihood 
approximation (iLindsayl . Il988l ; IVarinl . 120081 ) . The procedure is sufficiently reliable and 
versatile to permit the simultaneous and consistent modelling of joint and marginal 
parameters in the spatial context at a moderate computational cost. 



3.1 Composite likelihoods 

For a parametric statistical model T with density function family T = {/(y;^),y 

]R d }, and a set of marginal or conditional events {X^ : k e K,} 
(for some K. C N) subset of some sigma algebra on y, the composite log-likelihood is 
defined by 

W;y) = J>g/(ye x*;V0, 

where log/(y G T^ij)) is the log-likelihood associated with event Tf.. First-order 
partial derivatives of ^cCV 7 ! y) with respect to tp yield the composite score function 
D^£c(i/>;y), from which maximum composite likelihood estimator of ip, if unique, is 

obtained by solving D^fcCV'MCLEi y) = 0. Similarly, second-order partial derivatives 
of D^,£ C (V>; y) yield the Hessian matrix H^ c (^>; y) (Appendix A.l). 

The key utility of the composite log-likelihood is it's ability, under the usual regu- 
larity conditions, to provide consistent and unbiased parameter estimates w hen stan- 
dard likelihood estimators are not available. Under appropriate conditions (ILindsayl . 
19881 ; ICox and Reidl . 120041 ) the maximum composite likelihood estimator is consistent 
and asymptotically distributed as 

V'MCLE ~ W, I(^)- 1 ) with I(^) = H(^) J(^)- 1 H(V), 

where H(^) = E{-H^ c (^; Y)} and J(^) = V{D^ C (^; Y)} are analogues of the 
expected information matrix and the variance of the score vector. Although the max- 
imum composite likelihood estimator can be unbiased, it may not be asy mptotically 
efficie nt in that the inverse of the Godambe information matrix iGodambe 



( 119601 ). may not attain the Cramer-Rao bound ICox and Reidl (120041 ) . 



3.2 The pairwise setting for spatial extremes 

Recall, for the spatial setting we have observations {y n ,k}, each denoting the maximum 
of m samples over n = 1, . . . , N blocks and k = 1, ... K locations in a continuous 
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region. E.g. for daily observations, m = 366 implies the y n ^ describe annual maxima. 
Accordingly, the K univariate marginals are approximately GEV distributed. Despite 
the intractability of the multivariate max-stable process, availability of the bivariate 
form ([3]) implies a pairwise composite log-likelihood may be constructed as 

TV K K-l 

MV>;y) = $^5Z Y lo ^f(yn,uyn,j]^), (5) 

n=l i=l j=i+l 

where each f(y n> i, y n j; is a bivariate marginal density based on data at locations % 
and j, taken over all distinct location pairs. 

In order to characterized limiting behavior by a max-stable process (CQ) we require 
unit Frechet marginal distributions. Accordingly, we consider the bijection (Yj, Yj) = 
g(Zi, Zj) with inverse function given by 

+ «\- z i= (i + &e^M) w ( 6) 

where Zj = Z(tj) and Yj = Y(tj), and for each marginal Y, the constants /i, £ and 
A > ensure that Z is unit Frechet distributed. The resulting bivariate density is 

fY^iyuVj) = fz^ [g^ivhVj)] \ J(vi,Vj)\, 

where fzi,z(zi, z j) denotes the density of the Gaussian extreme value model (jlj), and 

i^i = J- ( 1 + to^)" 5 - 1 (, + taT 1/£ '-' 

This change of variable permits the use of GEV marginals (over unit Frechet) with- 
out reforming the problem definition. Hence, the pairwise log-likelihood (jSJ) allows 
simultaneous assessment of the tail dependence parameters (j3J) between pairs of sites 
and also the location, scale and shape parameters of the marginal distribution at 
each location. The parameters can not be estimated as an analytical solution of the 
composite score equation. Nonetheless quasi-Newton numerical maximization rou- 
tines (e.g. Broyden, 1967) can be applied in order to obtain maximum likelihood 
estimates. 

Variances of parameter estimates are provided through inverse of the Godambe 
information matrix, with estimates of the matrices B.(tp) and J(^) given by 

TV K K-l 

h(^mcle) = - Y Y Y H ^ l °sf(yn,i,yn,j;iucLE) 

n=l i=l j=i+l 

and 

TV K K-l 

J(^MCLe) = Y D ^ lo § f(Vn,i> Vnj] V'MCLE) log f(y n>i , y nJ ; ^ M CLe) T > 

n=l i=l j=i+l 
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each evaluated at the composite maximum likelihood value. In practice the matrix H 
is obtained straightforwardly with the numerical maximization routine employed for 
likelihood maximization. An explicit expression for J is derived in Appendix A. 4. 

In principle, estimating unique marginal parameters for each location ensures 
correct model application by respecting marginal constraints, though computational 
issues arise for large numbers of parameters. Alternatively, as is common in the 
modelling of univariate extremes, we may describe the GEV parameters through 
parsimonious regression models, which may be functions of space, environmental and 
other covariates and random effects. Specifically, we may express each parameter as 



t?(x) = h{f{x)} = X/5, 



(7) 



where h is a link function, x vector of predictors, X is a N x (d + 1) 

design matrix, and /3 is a (d + 1) X 1 vector of unknown parameters. 

For further flexibility, a non-parametric approach may provide a useful alternative. 
Non-parame tric modelling of univariate ext r eme v a lue responses has been rec ently 
proposed by IChavez-Demoulin and Davison! (120051). lYee and Stephenson! (120071 ) and 
Padoan and Wandl (120081 ). The work of iKammann and Wandl (120031 ). who developed 
a non-parametric spatial regression with Gaussian response, may be extended to the 
current spatial extremes setting. 



3.3 Model selection 



There are two mod el selec t ion ap proaches under the composite likelihood framework. 
For nested models IVarinl (120081 ) the p-dimensional parameter, if> is partitioned as 



i/j = (i()',if)"), where if)' is g-dimensional, and testing if>' = if) versus a two-sided 
hypothesis proceeds via the composite likelihood ratio test statistic 

W^ ) = 2{£S)-£ c (^"^o))}- 



Under the null iKentl (Il982h 



W 



V iXi ' 

3=1 



where xl are independent xi random variables, and v\ > . . . > v q are the eigenvalues 
of (H^ ^ ) _1 1^'. Here, and and 1^,'^' respectively denote the information 

matrix and the Godambe information matrix, each restricted to those elements asso- 
ciated with parameter if)'. (Dependence on if) under the null is omitted for brevity.) 
Hypothesis testing based on (181) either approxim ates null distribution using estimates 
of the eigenvalues u, i lRotnitzky and Jewell! (119901 ) . o r adjusts the compos i te like lihood 



such that the usual asymptotic xi nu U is preserved IChandler and Bate! (|2007D . 



The composite likelihood information criterion (CLIC) IVarin and Vidonil (120051 ). 
useful in the case of non-nested models, performs model selection on the basis of ex- 
pected Kullback-Leibler divergence between the true unknown model and the adopted 



S 



model (IDavisonl . 12003k p. 123). In the compos ite likelihood con text, this is the AIC 
under model mis-specification (ITakeuchil . Il976l ); (IDavisonl . 120031 . p. 150-152). Model 
selection is based on the model minimising 

-2{ Wmcle; Y) - tr[J(v> MCLE ) h^mcle)- 1 ]}, 

where the second term is the usual composite log-likelihood penalty term. 



4 Simulation Study 

We now evaluate the utility of the composite likelihood in the spatial extremes con- 
text. We examine various forms of extremal dependence with the Gaussian storm pro- 
file fl4]) characterised through the covariance, £, including directional and strength of 
dependence variations (Tableland Figure [1]). The covariance has direct meteorologi- 
cal interpretation and defines the extremal dependence directly. The K site locations 
are uniformly generated over a 40 x 40 region. Given the moderate computational de- 
mand for large site numbers, likelihood maximisation (and other) routines have been 
implemented in C and collected in the forthcoming R package Spat ialExt remes. 



Table 1: Extremal dependence configurations. 





Spatial dependence structure 




a\ 


0"12 


Si 


Same strength in both directions 


300 


300 





s 2 


Different strength in both directions 


200 


300 





s 3 


Spatial correlation 


200 


300 


150 


s 4 


Strong dependence 


2000 


3000 


1500 


s 5 


Weak dependence 


20 


30 


15 




10 20 30 40 10 20 30 40 10 20 30 40 10 20 30 40 10 20 30 40 



Figure 1: A Gaussian extreme value process realisation for Si, . . . , £5. Max-stable process 
simulation routines are available in the RandomFields package in R (Schlather, 2002). 



Table [2] summarizes estimator performance based on moderately sized datasets: 
K = 50 sites and N = 100 observations. Estimate means and standard errors over 
500 data replications are reported, indicating good correspondence to the true values. 
There is no evidence of bias, even in cases where poorer performance may be expected 
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Table 2: Composite MLE's based on 500 spatial extreme data simulations (K = 50 sites and 
N = lOOj with the Gaussian extreme value model. True values are in [brackets]. Standard 
errors obtained through Godambe estimates and sample standard deviation (in parantheses). 





o{ 1 s.e. 


g\ 1 s.e. 


a{ 2 / s.e. 


Si: 


306 [300] / 40.6 (44.7) 


306 [300] / 39.8 (41.5) 


1 [0] / 27.9 (27.7) 


S 2 : 


204 [200] / 26.7 (28.5) 


305 [300] / 39.6 (39.7) 


1 [0] / 21.9 (21.2) 


S 3 : 


202 [200] / 25.1 (26.1) 


300 [300] / 37.3 (37.9) 


150 [150] / 25.5 (26.1) 


S 4 : 


2053 [2000] / 495.2 (300.1) 


3065 [3000] / 664.8 (483.1) 


1550 [1500] / 412.0 (322.4) 


£ 5 : 


20 [20] / 1.5 (1.6) 


30 [30] / 2.3 (2.3) 


15 [15] / 1.6 (1.6) 



such as very strong or weak dependence. Overall, the Godambe standard errors and 
sample standard deviations are consistent, though there is some discrepancy in the 
case of strong dependence. Here, the eigenvalues of Sj 1 are 5 ± Vl~0/7500 and 
so we are near the parameter space boundary. Consequently, for some of the 500 
replications, the asymptotic normality of ^MCLE fails, and so the Godambe standard 
errors are not relevant. 

Table 3: Normalised mean squared error of extremal coefficient estimates based on 500 
data simulations with K = 50 sites and N = 100 observations. Estimators are the com- 
posite MLE and those proposed by Smith (1990) and Schlather and Tawn (2003). Standard 
deviations are reported in parantheses. The values' order of magnitude is 10~ 4 . 





Composite MLE 


Smith 


Schlather & Tawn 




3.1 ( 5.4) 


18.4 (26.7) 


17.3 (29.9) 


S 2 : 


2.9 ( 4.8) 


19.1 (30.3) 


18.9 (28.6) 


S 3 : 


2.5 ( 4.7) 


21.9 (35.7) 


21.1 (32.4) 


E 4 : 


3.0 (35.6) 


10.9 (18.9) 


6.7 (13.3) 


S 5 : 


0.3 ( 0.8) 


30.3 (44.6) 


25.5 (40.7) 



Table [3] depicts normalised mean squared errors for three different estimators of the 
extremal coefficient functions: the co mposite MLE derive d from the Gaussian extrem e 
value model, and those proposed by ISmithl (119900 and ISchlather and Tawn] (120031 ). 
Normalized mean square errors are used to prevent the largest extremal coefficients 
from dominating. From Table [3] in general it is clear that the composite likelihood 
estimator is the most accurate. The standard deviation in the strong dependence 
case again suffers from the failure of the asymptotic normality of V'MCLE- 

Estimator performance for a range of dataset sizes (N = 10, 50, 100, 500) and 
site numbers (K = 10, 50, 100) is listed in Table 01 under 500 data replications of 
spatial dependence model S 3 . As expected, the simulations indicate that there is 
some bias and larger variance for small N, and negligible bias and small variance for 
large N. Observe that, for fixed sample size, the number of sites does not impact the 
estimation results. This behavior is illustrated in Figure [2] highlighting, in particular, 
good estimator performance with increasing N. 
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Table 4: Composite MLEs for a varying number of sites (K) and observations (N), based 
on 500 simulations of spatial extreme data using the Gaussian extreme value model with 
covariance £3 (Table\lty. Standard deviations are reported in parantheses. 



K 


N = 10 


/V — 50 

1 V 'J V J 


N = 100 


N 


= 500 


True 


o\ 10 


245 (120.2) 


207 (43.8) 


205 (31.7) 


199 


(13.3) 


200 


50 


244 ( 90.4) 


208 (37.5) 


200 (28.3) 


199 


(11.4) 




100 


239 ( 94.3) 


205 (37.8) 


202 (30.4) 


199 


(11.5) 




o\ 10 


353 (159.1) 


305 (63.9) 


301 (44.8) 


298 


(19.5) 


300 


50 


353 (131.6) 


309 (56.6) 


303 (44.3) 


298 


(16.9) 




100 


361 (143.4) 


307 (59.5) 


301 (44.8) 


299 (16.7) 




Sia 10 


174 (108.9) 


153 (41.8) 


151 (31.9) 


149 


(13.4) 


150 


50 


179 ( 91.2) 


156 (38.8) 


149 (28.6) 


149 


(11.4) 




100 


181 (100.9) 


154 (39.2) 


150 (29.7) 


150 


(11.2) 





Finally, we consider model selection under misspecification. Figure [3] illustrates 
power curves (likelihood ratio tests) and rejection rates (CLIC) for two hypotheses, 
each versus their complement. Namely, Hq : <j\ = 200 fixing <j\ = 300 and a 12 = 150 
(top plots), and Hq : o\ = erf = 200 fixing o\i = (bottom plots). All resulting curves 
are near-quadratic and, for the pairwise likelihood ratio based tests, rejection rates 
are close to the confidence level a = .05 when the null hypothe sis is true. In this 
study, contrary to t he results derived bylChandler and Bate! (120071 ). the adjustment of 
the W statistic (jHJ) iRotnitzky and Jewell fll990h appears to have slightly more power 
- even when testing mutiple parameters. In contrast the CLIC statistic demonstrates 
poor performance, with the rejection rate under the true model reaching only 20%. 



5 Application to U.S. precipitation data 

We illustrate the developed methodology in an analysis of U.S. precipitation data. 
These data consist of 46 gauging stations with daily rainfall records over a period of 
91 years (Figure HJ). We express the model GEV parameters as simple linear functions 
of longitude, lattitude and altitude. Further model variations, including additional 
environmental covariates, and regressions of the covariance E on these covariates to 
examine spatial variations in extremal dependence were not considered. 

Exploratory analyses of the functional complexity of the GEV parameters was 
performed by fitting independent GEV models to the data from each station and 
evaluating appropriate surface responses using standard techniques (e.g. ANOVA, 
Fisher tests). After identifying an upper limit to model complexity, we perform 
model selection on the pairwise composite likelihood of the Gaussian extreme value 
process using the methodologies outlined in Section 13. 3[ 

Table summarizes some of the different models investigated. According to the 
CLIC criterion, model M 5 (in bold) is the preferred choice, although models M 2 and 
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Table 5: Some Gaussian extreme value processes and their corresponding maximised nega- 
tive composite log-likelihood, degrees of freedom and the CLIC score. 



Model -M^MCLEJy) d.f. CLIC 



M : 


a(x) 
£(x) 


= ao + ai(lat) + a2(alt) + a 3 (lon) 
= fa + A(lat) + & (alt) + #2(1011) 
= 7o 


412,110.2 


12 


12,848,229 


Mi : 


f \ 

H(x) 
a(x) 

e(x) 


= a + ai(latj + a 2 (alt) 

= A) + A(lat) + # 2 (alt) + /3 3 (lon) 

= 7o 


412,110.9 


11 


16,096,068 


M 2 : 


H(x) 
a(x) 


= ct + ai(lat) + a 2 (alt) + a 3 (lon) 
= /5o + A(lat)+/5 2 (alt) 
= 7o 


412,113.3 


11 


1,008,997 


M 3 : 




= a + ai(lat) + a 3 (lon) 

= Pq + /3i(lat) + /5 2 (alt) + #,(lon) 

= 7o 


412,234.1 


11 


1,926,389,242 


M A : 


ji{x) 
a{x) 
i{x) 


= «o + «i(lat) + «2(alt) + a 3 (lon) 
= A) + /3i(lat) + /3 3 (lon) 
= 7o 


412,380.5 


11 


33,209,042 


M 5 : 


/i(x) 
cr(x' 
e(x) 


= «o + «i(lat) + «2(alt) 
= Po + /5i(lat) +#2 (alt) 
= 7o 


412,113.3 


10 


1,008,261 


M 6 : 


fi(x) 
a(x) 


= «o + «i(lat) 

= Po + p 1 {]&t) + p 2 {alt) 

= 7o 


412,237.3 


9 


1,086,347 
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Figure 2: Composite MLE distribution estimates for N = 50 (dotted line), N = 100 (broken 
line) and N = 1000 (solid line) observations over K = 10 (^op row), K = 50 (centre row) 
and K = 100 (bottom row) sites. Contours correspond to 0.25, 0.5 and 0.75 percentiles. 

M 6 also appear competitive. Despite the relatively poor performance of the CLIC 
criterion for model selection (see Section H]), the same conclusions are obtained from 
the deviance based tests, with p-values around 0.9. For model M 5 , the covariance E 
is estimated as a\ = 0.06323 (0.06463), a 12 = 0.01334 (0.03661) and a 2 2 = 0.02581 
(0.02538), where the parantheses indicate standard errors. 

Figure [5] (right plot) illustrates the spatial variation of pointwise 50-year return 
level estimates. Comparison to the regional elevation map (left plot) indicates that 
the most extreme precipitation events occur in mountainous regions. Figure [6] (left 
plot) also depicts the strength of spatial dependence through the extremal coefficient 
function. There is clear evidence of anisotropy, with stronger dependence in the 
north-east/south- west direction. Interestingly, this axis corresponds to the shape of 
the Appalachian Mountains as well as the coastline. Accordingly, this directional 
extremal dependence may be the consequence of storms following either the coastline 
or the massif. Finally, conditioning of a fixed site (and for a given threshold) by using 
the pairwise conditional distribution, the conditional g-year return level estimates can 
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Figure 3: Left panels: Power curves for the composite likelihood ratio tests; RJ (Rotnitzki 
and Jewell, 1990), and CB c hoi and CB sv d (Chandler and Bate, 2007) using Cholesky and 
singular value decompositions. Right panels: CLIC rejection rates. Test levels are a = 0.05. 
Point estimates are based on 1000 data replications. 



easily be provided. Figure [6] (right plot) illustrates the spatial variation of conditional 
pointwise 50-year return level estimates, given the fixed site indicated by the star. 



6 Conclusion 



As a natural generalisation of extremal dependence structures, max-stable processes 
are a powerful tool for the modelling of multivariate extremes. Unfortunately, the 
intractability of the multivariate density function precludes inference except in trivial 
cases (e.g. bivari ate), or requires add i tiona l approximations and immense comput a- 



tional overheads (j Jiang and Turnbulll . 12004 ISisson et all 120071 ; iPeters et all 12008 



This article has developed composite likelihood-based inferential methods for gen- 
eral max-stable processes. Our results demonstrate good applicability in the spatial 
context. The benefits of this likelihood-based approach are the flexible joint modelling 
of marginal and dependence parameters, coupled with good estimator behaviour with 
finite samples, all at moderate computational cost. 

Modifications of the model formulation would draw alternative representations of 
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Figure 4: Locations of the 46 gauging stations. 




Figure 5: Left: Elevation map (metres) of the region; Right: Pointwise 50-year return 
level map (cm) estimated from the fitted Gaussian extreme value process. 




Longitude (degree) Longitude (degree) 



Figure 6: Left: Contour plot of the extremal coefficient; Right: Pointwise 50-year, 
conditional return level map (cm) estimated from the fitted Gaussian extreme value 
process. The star indicates the fixed site used in the computation of the conditional 
return levels. 
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extremal modelling into the composite-likelihood based frame work, g i ven th e known 
links between these and block maxima (GEV) approaches (e.g. IColesI tOOlh ). These 
include threshold excess models for marginals (Davison and Smith, 1990), and the 
limiting Poisson characterisation of extremes. The obvious practical benefit from 
these extensions would be the incorporation of more data into the modelling process. 
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Appendix 



We present explicit expressions for the distribution function ([3]), density function (J4]) 
and the derivatives required for the estimated covariance matrix in Section 13.21 



A.l: Vector notation 

Let / be a real- valued function in the d x 1 vector x = {x\, . . . , Xd)- Then the 1 x d 
derivative vector, D x /(x), has i-th element df(x)/dxi. The corresponding Hessian 
matrix is given by H x /(x) = D x {D x /(x)} T . 

If a = (ai, . . . , ad) and b = (bx, . . . ,bd) are two rfxl vectors, then element-wise 
multiplication is denoted by a b = (ai&i, . . . , a^). The expression a/b denotes 
element- wise division (ai/fei, . . . , ad/bd). Scalar functions applied to vectors are also 
evaluated element-wise. For example, a" 1 ^ = (a x , . . . , cl^^). 



A. 2: Derivation of the bivariate distribution function 

In order to derive the cumulative distribution function ([3]), considering formula 
and the assumptions of Section 12. 3\ we need to solve: 



oo poo 



,'f(x u X 2 ) f(x! -h,X 2 ~t 2 ) . 

t [Zi, Zj) = exp s — / max I , I dxi ax 2 



oo J — oo \ ^% Z j 



OO /"OO 



ex , , , f(x lt x 2 ) ff{x^x 2 ) ^f{x 1 -t 1 ,x 2 -t 2 )' ^ 



OO J — OO \ z j 



OO /"OO 



f(xi -h,X 2 - *2)j / /Ql -ti,X 2 ~ t 2 ) > f(Xi,X 2 ) ' , 



oo 7 -oo z j V 

where f(xi,x 2 ) is the bivariate normal density of (Xi,X 2 ) ~ iV((0, 0) T , £), and for 
brevity we set h = (tj — tj) T = (ti,t 2 ) T . Recall from ([3]) that 

^^ m ) i/2 -7ra(5-^ + 3) 1/2 

where p = cri2/o"iO"2- Consider first the case (t\<j 2 —pt 2 <j\) > 0. Note that /(a;i, x 2 )/zi > 
f(x% — ti, x 2 — t 2 )/zj implies that 

/ 1 (x\ 2p Xl x 2 x 2 2 \\ 

. f 1 / >i~ti) 2 2p{x 1 -t l ){x 2 -t 2 ) {x 2 -t 2 f \\ 

- eXP \ 2(1 -M a? a x a 2 + a| J J '* 

^2 / 1? 2ptit 2 , *| \ , <rft7 2 (l - p 2 ) . Zj x 2 (t 2 aj - pha^) 
^ Xi < — r — h -4- + log — - 



2{t 1 a 2 - pt 2 (Tx) \a\ o x o 2 a\) t x a 2 - pt 2 a x t v o 2 - pt 2 a x 

<=>■ X\ < c. 
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From this, it follows that 



oo POO 



OO J — OO \ "I ~J 



2pxix 2 x 2 . 



r/ vfa) [ <p [ ^JyT ^\ dxi dx2 



2^/(1 -p 2 ) W cr|y v^i 2 ff i^2 <y\) (l-p 2 )- 1 / 2 



i $ / i f±_ w | t 2 y /z | /t 2 1/2 log^-M 



1 $ f a ( h ) , lo s%/^ 



2? V 2 a(h) 
Similarly, 



/(xi - h,x 2 - t 2 ) f(xi,x 2 ) _ . 

> ^ Xi> c. 



It then follows that 



oo roo 



f(xi - £i, £ 2 - t 2 ) j ( f(xi,x 2 ) > /pi -ti,g 2 ~ hY ■ 



oo </ -oo %' V ^3 

1 / 1 />i-ti) 2 2p(x 1 -t 1 )(x 2 -t 2 ) , (x 2 



«/ — oo J c 



ln^/T^fi l 2(l-p 2 )V <J\ <Ti<T 2 



z j J-oo Jc \ (7ia/1 — p 



Z 3 J -oo 



I f 1 ft 2 2pt 1 t 2 t t 2 2 y /2 t (t\ 2pt 1 t 2 t t 2 \ 1/2 log Zi/zj 



:j ^2^/(1 -p 2 ) W ^1^2 o-f/ W ^i^2 o-fy (i-p 2 )- 1/2 

J_ $ f «( h ) , log Zi/Zj 



Zj \ 2 a(h) 

and the form of the distribution is confirm ed. Observe, that the same result is ob- 
tained for the case (ticr 2 — pt 2 a{) < 0. See also lSmithl (119901 ) and lDe Haan and Pereira 
(|2006h . 
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A. 3: Derivation of the bivariate density function 

In order to derive the bivariate density function (j4j) we require the second-order 
derivative of 

$(u) 



F(zi, Zj) = exp 



Z{ Zj 



with respect to zi and Zj, where for brevity we set a = a(h), w = w(h) and v = v(h) 
and write w = a/2 + log(zj/zi)/a and v = a — w. The differentiation gives 



exp 



d_ 2 

dzidzj 

<f>(w) <&(v)\ I d ( <&(w) $(v)\ d ( <fr(w) $(v)\ d 2 ( <fr(w) <f>(v) 



f[Zi,Zj) — F{Zi,Zj 



Z>i Zj J I Q Z'i \^ Z"i Zj J d Zj \^ Z<i Z j J O ^ Z^ Q Z j \^ Z<i Z j 

First-order differentation gives 

d ( $(u>) $(w) (p(w) (f(v) d ( <£>(u>) ${v)\ <p(v) V 9 ( w ) 

dzi \ Zi Zj J z 2 azf aZiZj ' dzj \ Z{ Zj J z 2 az 2 az { Zj ' 

using the results 

d&(w) f(w) d&(v) ip(v) dw 1 dv 1 

dzi azi ' dzi azi dz^ az t ' dzi azi 

Second-order differentation yields 

d 2 f &(w) vip(w) wip(v) 



dzidzj \ Zi Zj J a 2 z 2 Zj a 2 ZiZ 2 ' 
using 

d(p(w) w(p(w) d(p(v) vip(v) 
= and — = . 

ozi azi ozi azi 

Substituting, we obtain the probability density function 



f(zi,Zj) = exp | 



2 2 

Zi, Zj J I y ^% CtZiZ j 



§(v) if(v) (f(w)\ f V if(w) w (p(v) 



2 2 / \ 2 2 22 

Zj QjZ j (XZj^Zj J \^ Ct Z^ Zj Ct Z^Zj 

A. 4: An expression for the squared score statistic 

From Section [372] the term of the Godambe information matrix can be estimated 
from 

K IC-1 

Yl D iA lo s/(y^yi^) T D^,log/(y i ,y i ;V'), 

i=l j=i+l 
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where tp T = (a, /? , (3 X , f3^), a T = (g\ x , <t 12 , and where each parameter f3 is 
p-dimensional vector of coefficients. The bivariate log-density has the form 



where 



C 



A 



log f(Yi, yf, i>) = A + log(B © C + D) + E, 
$fw) $fv) „ $(w) y?(w) <^(v) 



B 



+ 



ip{y) 



z? 



V?(w) 



a0z a0z ; 0Zj 



D 



a0z- a Zj Zj 

V <^(w) W0(y9(v) 



a 2 zf © a 2 z; z 2 



and E = Ioe 



1 + 6 



A, 



1 + 



Ay 



1 



and where 1 T = (1, . . . , 1), /ij = (X^^)*, & = (X^/? e )j and log(^) = (X^^);. 
The GEV parameters are related to the predictors by the form ([7]). We assume 
identity link functions for the location and shape parameters and exponential for the 
scale. The term E corresponds to the log of the determinat of the Jacobian matrix 
associated with the transformation ([6]), see Section [321 

The first-order derivative term of the square score statistic is defined by 

D^log/( yi ,y^) = (D ff log/(^),D^log/(V),D^ log /(^), log/ty)) 

wher e for brevity we write log f{tp) = log/(yj, y.,-; if?). Vector differential calculus 
(e.g. IWandl d2QQ2h ) leads to 



D CT log/(V) 



v (f(w) w Q (p(v) 



a Zj 



a0z 



3 



c© 



> 2 -l)0y?(v) (l + w0v)0(^(w; 



a 2 0z 



a 2 Zj © z,- 



+ B© 



V 2 -1)0 y?(w) (1 + w v) © <p(v) \ (w - 2v - w v 2 ) © <^(w 



a 2 zf 



+ 



'v - 2w - v w 2 ) ip(y) 
a 3 z 2 z. 



a 2 © Zj © Zo 



/(B0C + D) 



a 3 zf z,- 



where s T = (t 2 , 2tit 2 , ^2)5 using the results 
Daw = - s , 



w © vyj(w) 



, / \ W©(1— V 2 )W(W) T 
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The first-order derivativies of v, $(v), <p(v) and w0<^(v) are the same as the above, 
substituting v for w. Similarly for the second term we have 



log/(^) = 



y?(w) + a $(w) v( v ) 
(p(v) + a0$(v) 



a Zj © zj 



© 



A, 



p(w) 



a©z| 



a Zj © z 







z^ 



A; 



fx 



+ 



p ^ | w0$(v) v0$(w) . 

U I a 2 0z I0 z 2 + a^Oz? I W 



A, 



+ 



c© 



(B C + D) 



(a+w)Qy(w) (2a+w)0y(v) 



fx 



a^0z;0z^ 



a 2 0z| 



(B C + D) 



fx 



B0 



I a 2 0z^0zf a 2 Oz 



(a+v)0y(v) _ (2a+v)0i^(w) _ 2$(w) 







A* 



(B C + D) 



fx 



+ 



B0 



v0$(w) w0$(v) 







(B0C + D) 



fx 



+ 



(l-a0v-v 2 )0</?(w) (l-a0w-w 2 )0y(v) 



a 2 0Zj0z? 



a^0z;0z 



3" 







A, 



(B0C + D) 



fx 



(l-a0w-w 2 )0y(v) 
a 2 Zi 0z? 



(l-a0v-v 2 )0p(w) ^ q fj_ 



a 2 0Zj0zf" 



(B C + D) 



fx 



0ji 



+ 



A,;Z^ 



■( X ^) 



A,z^ 



Observe that the above expression is obtained by deriving in order the components: 
A, Dp log(B C + D) and E. These three components have the form 

D^/(x) = D z ./(x)D^Zi + D Zj ./(x)D^z i , where z; = z^/A^X^. For this 
reason the derived expressions of log /(V 7 ) and D^log / (if}) are essentially the 
same but substituting z, with DpZi and Dp z t , and E with D^E and E. 



23 



We have 



A, 



(X^ and D« Zi 



A; 



Zilog(zi) } (Xfl )i. 



Finally, 



(e,-i)( yi -^i) 



AiZi 



i )^>+( (o " i Sr M) - i ) (x ^ and 



6 



zfA> 



+ 



I J (l-£;)(y;-/^-l) 



z^A, 



log(z. 



- !og( z i) 







(X/?J r 



In combination we obtain an expression for D^, log /(y«, y^; if>), and from this the 
squared score statistic. 
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